A Theoretical Analysis of Feature Pooling in Visual Recognition
نویسندگان
چکیده
Many modern visual recognition algorithms incorporate a step of spatial ‘pooling’, where the outputs of several nearby feature detectors are combined into a local or global ‘bag of features’, in a way that preserves task-related information while removing irrelevant details. Pooling is used to achieve invariance to image transformations, more compact representations, and better robustness to noise and clutter. Several papers have shown that the details of the pooling operation can greatly influence the performance, but studies have so far been purely empirical. In this paper, we show that the reasons underlying the performance of various pooling methods are obscured by several confounding factors, such as the link between the sample cardinality in a spatial pool and the resolution at which low-level features have been extracted. We provide a detailed theoretical analysis of max pooling and average pooling, and give extensive empirical comparisons for object recognition tasks.
منابع مشابه
Recognition of Visual Events using Spatio-Temporal Information of the Video Signal
Recognition of visual events as a video analysis task has become popular in machine learning community. While the traditional approaches for detection of video events have been used for a long time, the recently evolved deep learning based methods have revolutionized this area. They have enabled event recognition systems to achieve detection rates which were not reachable by traditional approac...
متن کاملMachine learning based Visual Evoked Potential (VEP) Signals Recognition
Introduction: Visual evoked potentials contain certain diagnostic information which have proved to be of importance in the visual systems functional integrity. Due to substantial decrease of amplitude in extra macular stimulation in commonly used pattern VEPs, differentiating normal and abnormal signals can prove to be quite an obstacle. Due to developments of use of machine l...
متن کاملLearning Hierarchical Feature Extractors For Image Recognition
Telling cow from sheep is effortless for most animals, but requires much engineering for computers. In this thesis, we seek to tease out basic principles that underlie many recent advances in image recognition. First, we recast many methods into a common unsupervised feature extraction framework based on an alternation of coding steps, which encode the input by comparing it with a collection of...
متن کاملGradient Driven Learning for Pooling in Visual Pipeline Feature Extraction Models
Hyper-parameter selection remains a daunting task when building a pattern recognition architecture which performs well, particularly in recently constructed visual pipeline models for feature extraction. We re-formulate pooling in an existing pipeline as a function of adjustable pooling map weight parameters and propose the use of supervised error signals from gradient descent to tune the estab...
متن کاملSupervised Feature Extraction of Face Images for Improvement of Recognition Accuracy
Dimensionality reduction methods transform or select a low dimensional feature space to efficiently represent the original high dimensional feature space of data. Feature reduction techniques are an important step in many pattern recognition problems in different fields especially in analyzing of high dimensional data. Hyperspectral images are acquired by remote sensors and human face images ar...
متن کامل